Overview
The fetch_market_news.py script retrieves live market news for each stock. News items are sourced from financial media, and each one carries an AI-generated sentiment label (positive, negative, or neutral).
Purpose
Fetches market news including:
- Latest news headlines and summaries
- AI-generated sentiment labels (positive/negative/neutral)
- Publication timestamps
- Source categories
- Configurable news limit per stock (default: 50)
API Endpoint
https://news-live.dhan.co/v2/news/getLiveNews
Request Payload
{
    "categories": ["ALL"],
    "page_no": 0,
    "limit": 50,
    "first_news_timeStamp": 0,
    "last_news_timeStamp": 0,
    "news_feed_type": "live",
    "stock_list": ["<ISIN>"],
    "entity_id": ""
}
Parameters
categories - News categories to fetch. Use ["ALL"] for all categories.
page_no - Page number for pagination (0-indexed)
limit - Number of news items to retrieve per stock. Maximum tested: 100.
first_news_timeStamp - Start timestamp filter (0 = no filter)
last_news_timeStamp - End timestamp filter (0 = no filter)
news_feed_type - Type of news feed (live or historical)
stock_list - Array of ISIN codes (typically one ISIN per request)
entity_id - Optional entity identifier
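As an illustration of how these parameters fit together, the sketch below builds the request body for one ISIN and one page. The helper name build_news_payload and its range checks are assumptions made for this example; they are not part of the script or the API.

```python
def build_news_payload(isin, page_no=0, limit=50):
    """Return the request body for one ISIN and one page of results.

    Hypothetical helper: the name and the validation are illustrative only.
    """
    if page_no < 0:
        raise ValueError("page_no is 0-indexed and cannot be negative")
    if not 1 <= limit <= 100:
        # 100 is the largest limit this documentation reports as tested.
        raise ValueError("limit should be between 1 and 100")
    return {
        "categories": ["ALL"],
        "page_no": page_no,
        "limit": limit,
        "first_news_timeStamp": 0,  # 0 = no start filter
        "last_news_timeStamp": 0,   # 0 = no end filter
        "news_feed_type": "live",
        "stock_list": [isin],
        "entity_id": "",
    }
```

Calling build_news_payload("INE002A01018", page_no=1) would request the second page for that ISIN with the default limit of 50.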
Output Files
market_news/{SYMBOL}_news.json
Per-stock news data with structure:
{
    "Symbol": "RELIANCE",
    "ISIN": "INE002A01018",
    "News": [
        {
            "Title": "Reliance Q3 results beat estimates",
            "Summary": "Full news text summary...",
            "Sentiment": "positive",
            "PublishDate": 1705334400,
            "Source": "Business News"
        }
    ]
}
Each stock gets up to 50 news items (configurable via NEWS_LIMIT).
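A downstream consumer might read one of these files as sketched below. The helper load_news and the added PublishedAt field are hypothetical, and the sketch assumes PublishDate is a Unix timestamp in seconds, which matches the sample value above but is not confirmed by the API.

```python
import json
from datetime import datetime, timezone


def load_news(path):
    """Load one {SYMBOL}_news.json file and add a readable UTC timestamp.

    Assumption: PublishDate is a Unix timestamp in seconds.
    """
    with open(path, "r", encoding="utf-8") as f:
        doc = json.load(f)

    items = []
    for entry in doc.get("News", []):
        published = datetime.fromtimestamp(
            entry.get("PublishDate", 0), tz=timezone.utc
        )
        # Keep the original fields and attach an ISO 8601 string.
        items.append({**entry, "PublishedAt": published.isoformat()})
    return doc.get("Symbol"), items
```

If the API turns out to return milliseconds instead, the value would need to be divided by 1000 before conversion.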
Function Signature
def fetch_market_news(item):
    """
    Fetches market news for a single stock.

    Args:
        item (dict): Stock object with 'Symbol' and 'ISIN' keys

    Returns:
        str | None: Status - "success", "empty", "rate_limit",
        "http_<status code>" for any other HTTP status, or "error"
        on an exception. None if 'Symbol' or 'ISIN' is missing.

    Process:
        1. Construct payload with stock's ISIN
        2. POST request to news API
        3. Extract and process news items
        4. Save to market_news/{SYMBOL}_news.json
    """
Dependencies
requests - HTTP client
json - JSON processing
os - File operations
time - Rate limit backoff
concurrent.futures.ThreadPoolExecutor - Parallel execution
pipeline_utils.BASE_DIR - Base directory path
pipeline_utils.get_headers() - Standard API headers
master_isin_map.json - ISIN to Symbol mapping
Threading Configuration
MAX_THREADS - Number of concurrent threads. Set to 15 to avoid overwhelming the news API.
NEWS_LIMIT - Number of news items to fetch per stock (max tested: 100)
Code Example
import json
import os
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

from pipeline_utils import BASE_DIR, get_headers

INPUT_FILE = os.path.join(BASE_DIR, "master_isin_map.json")
OUTPUT_DIR = os.path.join(BASE_DIR, "market_news")
MAX_THREADS = 15
NEWS_LIMIT = 50

# Create the output directory if it does not exist yet.
os.makedirs(OUTPUT_DIR, exist_ok=True)


def fetch_market_news(item):
    symbol = item.get("Symbol")
    isin = item.get("ISIN")
    if not symbol or not isin:
        return None  # Skip entries without a symbol or ISIN

    output_path = os.path.join(OUTPUT_DIR, f"{symbol}_news.json")
    url = "https://news-live.dhan.co/v2/news/getLiveNews"
    payload = {
        "categories": ["ALL"],
        "page_no": 0,
        "limit": NEWS_LIMIT,
        "first_news_timeStamp": 0,
        "last_news_timeStamp": 0,
        "news_feed_type": "live",
        "stock_list": [isin],
        "entity_id": ""
    }
    headers = get_headers()

    try:
        response = requests.post(url, json=payload, headers=headers, timeout=10)
        if response.status_code == 200:
            data = response.json()
            news_items = data.get("data", {}).get("latest_news", [])
            if news_items:
                # Keep only the fields written to the output file.
                processed_news = []
                for news in news_items:
                    news_obj = news.get("news_object", {})
                    processed_news.append({
                        "Title": news_obj.get("title", ""),
                        "Summary": news_obj.get("text", ""),
                        "Sentiment": news_obj.get("overall_sentiment", "neutral"),
                        "PublishDate": news.get("publish_date", 0),
                        "Source": news.get("category", "")
                    })
                final_output = {"Symbol": symbol, "ISIN": isin, "News": processed_news}
                with open(output_path, "w") as f:
                    json.dump(final_output, f, indent=4)
                return "success"
            else:
                return "empty"
        elif response.status_code == 429:
            time.sleep(2)  # Rate limit backoff (the request is not retried)
            return "rate_limit"
        else:
            return f"http_{response.status_code}"
    except Exception:
        # Network failures, timeouts, and malformed responses all land here.
        return "error"


def main():
    with open(INPUT_FILE, "r") as f:
        stock_list = json.load(f)
    total = len(stock_list)
    print(f"Starting Market News Fetch (Limit: {NEWS_LIMIT}) for {total} stocks...")

    results = Counter()
    with ThreadPoolExecutor(max_workers=MAX_THREADS) as executor:
        future_to_stock = {
            executor.submit(fetch_market_news, item): item.get("Symbol")
            for item in stock_list
        }
        for future in as_completed(future_to_stock):
            # Tally each status; None marks entries skipped for missing data.
            results[future.result()] += 1
    print(f"Finished: {dict(results)}")


if __name__ == "__main__":
    main()
Usage
python3 fetch_market_news.py
- Execution Time: ~4-6 minutes for 2,775 stocks
- API Calls: 2,775 requests (one per stock)
- Output: up to 2,775 individual JSON files in the market_news/ directory (a file is written only when a stock returns news)
- Concurrency: 15 parallel threads
- News per Stock: up to 50 items (configurable)
Rate Limiting
- Detects HTTP 429 (rate limit) responses
- Sleeps for 2 seconds in the worker thread when a 429 is received; the request itself is not retried
- Returns "rate_limit" status for monitoring
- 10-second timeout per request
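Because a rate-limited request only reports its status, a caller that wants the data has to retry. The sketch below shows one possible wrapper with exponential backoff. The name call_with_backoff, the attempt count, and the delays are assumptions for illustration and are not part of the script.

```python
import time


def call_with_backoff(fetch, item, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(item) again while it reports "rate_limit".

    Hypothetical wrapper: waits base_delay, then 2x, 4x, ... between
    attempts and returns the last status once attempts run out.
    """
    status = None
    for attempt in range(max_attempts):
        status = fetch(item)
        if status != "rate_limit":
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return status
```

It could be submitted to the thread pool in place of the fetch function, for example executor.submit(call_with_backoff, fetch_market_news, item). Since fetch_market_news already sleeps for 2 seconds on a 429, the delays here come on top of that pause.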
Sentiment Analysis
News items include AI-generated sentiment classification:
- positive - Bullish/favorable news
- negative - Bearish/unfavorable news
- neutral - Non-directional/informational news
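To show how these labels might be used, the sketch below tallies them for one stock and derives a simple net score. The function summarize_sentiment and the net formula (positive minus negative, divided by the total) are illustrative assumptions, not something the script or the API provides.

```python
from collections import Counter


def summarize_sentiment(news_items):
    """Count sentiment labels in a list of news dicts from the output file.

    Hypothetical helper: missing or empty labels are counted as neutral.
    """
    counts = Counter(
        (item.get("Sentiment") or "neutral").lower() for item in news_items
    )
    total = sum(counts.values())
    # Net score in [-1, 1]; 0.0 when there are no items.
    net = (counts["positive"] - counts["negative"]) / total if total else 0.0
    return {
        "positive": counts["positive"],
        "negative": counts["negative"],
        "neutral": counts["neutral"],
        "net": net,
    }
```

The input is the "News" list from a market_news/{SYMBOL}_news.json file.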
Notes
- Automatically creates market_news/ directory if it doesn’t exist
- News is fetched fresh on every run (no caching)
- Sentiment labels are pre-computed by Dhan’s AI engine
- Maximum tested limit: 100 news items per stock
- Use page_no parameter for pagination if needed
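Pagination with page_no could look roughly like the sketch below. It assumes a hypothetical callable fetch_page(page_no) that POSTs the payload for that page and returns the list of news items, and it assumes an empty page marks the end of the feed, which this documentation does not confirm.

```python
def fetch_all_pages(fetch_page, max_pages=5):
    """Collect news items page by page, starting at page 0.

    fetch_page is a caller-supplied function (hypothetical) that returns
    the list of items for one page. Stops at the first empty page or
    after max_pages pages, whichever comes first.
    """
    all_items = []
    for page_no in range(max_pages):
        items = fetch_page(page_no)
        if not items:
            break
        all_items.extend(items)
    return all_items
```

The max_pages cap keeps the loop bounded if the API keeps returning results, and it limits the number of extra requests per stock.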